Parallel Two-Stage Hessenberg Reduction using Tile Algorithms for Multicore Architectures

Authors

  • Hatem Ltaief
  • Jakub Kurzak
  • Jack Dongarra
Abstract

This paper describes a parallel Hessenberg reduction in the context of multicore architectures using tile algorithms. The Hessenberg reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenvalue problem. Although expensive, orthogonal transformations are accepted techniques and commonly used for this reduction because they guarantee stability, as opposed to elementary transformations similar to those used in Gaussian elimination. The state-of-the-art, high-performance dense linear algebra software libraries, i.e., LAPACK and ScaLAPACK, reduce the matrix to Hessenberg form through a one-stage process by using a block Householder approach with compact WY representation. However, the main drawback of the tile algorithms approach for the Hessenberg reduction is that the full reduction cannot be obtained similarly in a standard one-stage process. A two-stage approach has to be considered. The first stage corresponds to a redesign of the block Hessenberg matrix reduction, introduced by Dongarra et al. [12], to benefit from tile algorithms in the multicore environment. The second stage further reduces the matrix bandwidth to achieve the required Hessenberg form using a parallel bulge chasing procedure. On the one hand, by exploiting the concepts of tile algorithms in the multicore environment, the block Hessenberg reduction (first stage) achieves 72% of the DGEMM peak on a 12000 × 12000 matrix with 16 Intel Tigerton 2.4 GHz processors. On the other hand, the parallel bulge chasing procedure (second stage) is not appropriate for tile algorithms and therefore performs poorly on multicore architectures and dramatically slows down the overall algorithm.
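To make the target matrix shapes of the two stages concrete, the following is a minimal sketch in pure Python of a textbook, unblocked Householder similarity reduction. It is not the tile algorithm or the bulge chasing procedure from the paper, and it makes no attempt at parallelism or performance. The function name `reduce_to_band_hessenberg` and its bandwidth parameter `b` are hypothetical names chosen for illustration: with `b > 1` the result has the band (block) Hessenberg shape that a first stage would produce, and with `b = 1` it has the final Hessenberg shape.

```python
import math


def reduce_to_band_hessenberg(a, b=1):
    """Reduce a square matrix (list of lists) by Householder similarity
    transformations so that every entry below the b-th subdiagonal is zero.

    b = 1 gives the usual upper Hessenberg form; b > 1 gives a band
    Hessenberg form. Illustrative, unblocked, sequential sketch only.
    """
    n = len(a)
    h = [[float(v) for v in row] for row in a]  # work on a copy

    # Column k needs a reflector only if at least two rows lie at or
    # below row k + b.
    for k in range(n - b - 1):
        r = k + b                       # first row touched by the reflector
        x = [h[i][k] for i in range(r, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue                    # column is already reduced

        # Householder vector v = x - alpha * e1, with the sign of alpha
        # chosen to avoid cancellation.
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        vtv = sum(t * t for t in v)
        m = n - r

        # Left update: rows r..n-1 of h become (I - 2 v v^T / v^T v) times them.
        for j in range(n):
            s = sum(v[i] * h[r + i][j] for i in range(m))
            f = 2.0 * s / vtv
            for i in range(m):
                h[r + i][j] -= f * v[i]

        # Right update: columns r..n-1 of h are multiplied by the same
        # reflector, which keeps the transformation a similarity.
        for i in range(n):
            s = sum(h[i][r + j] * v[j] for j in range(m))
            f = 2.0 * s / vtv
            for j in range(m):
                h[i][r + j] -= f * v[j]

        # Overwrite rounding residue in the reduced column with exact values.
        h[r][k] = alpha
        for i in range(r + 1, n):
            h[i][k] = 0.0

    return h
```

Because each step is an orthogonal similarity transformation, the eigenvalues (and hence the trace) of the input are preserved, which gives a cheap sanity check when experimenting with different values of `b`.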

Similar resources

Scheduling two-sided transformations using tile algorithms on multicore architectures

The objective of this paper is to describe, in the context of multicore architectures, three different scheduler implementations for the two-sided linear algebra transformations, in particular the Hessenberg and Bidiagonal reductions, which are the first steps for the standard eigenvalue problems and the singular value decompositions, respectively. State-of-the-art dense linear algebra software,...


Scheduling Two-sided Transformations using Algorithms-by-Tiles on Multicore Architectures LAPACK Working Note #214

The objective of this paper is to describe, in the context of multicore architectures, different scheduler implementations for the two-sided linear algebra transformations, in particular the Hessenberg and Bidiagonal reductions, which are the first steps for the standard eigenvalue problems and the singular value decompositions, respectively. State-of-the-art dense linear algebra software, such ...


Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited LAPACK Working Note #208

The objective of this paper is to extend and redesign the block matrix reduction applied for the family of two-sided factorizations, introduced by Dongarra et al. [9], to the context of multicore architectures using algorithms-by-tiles. In particular, the Block Hessenberg Reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenva...


Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited

The objective of this paper is to extend and redesign the block matrix reduction applied for the family of two-sided factorizations, introduced by Dongarra et al. [9], to the context of multicore architectures using algorithms-by-tiles. In particular, the Block Hessenberg Reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenva...


Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures Using Tree Reduction

Abstract. The objective of this paper is to enhance the parallelism of the tile bidiagonal transformation using tree reduction on multicore architectures. First introduced by Ltaief et al. [LAPACK Working Note #247, 2011], the bidiagonal transformation using tile algorithms with a two-stage approach has shown very promising results on square matrices. However, for tall and skinny matrices, the ...


Publication date: 2009